Applying Natural Language Processing Techniques for Effective Persian- English Cross-Language Information Retrieval

نویسنده

  • H. Alizadeh
چکیده

Much attention has recently been paid to natural language processing in information storage and retrieval. This paper describes how the application of natural language processing (NLP) techniques can enhance cross-language information retrieval (CLIR). Using a semi-experimental technique, we took Farsi queries to retrieve relevant documents in English. For translating Persian queries, we used a bilingual machinereadable dictionary. NLP techniques such as tokenization, morphological analysis and part of speech tagging were used in pre-andpost translation phases. Results showed that applying NLP techniques yields more effective CLIR performance.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Applying Light Natural Language Processing to Ad-Hoc Cross Language Information Retrieval

In the CLEF 2005 Ad-Hoc Track we experimented with language-specific morphosyntactic processing and light Natural Language Processing (NLP) for the retrieval of Bulgarian, French, Italian, English and Greek.

متن کامل

Solving the Polysemy Problem of Persian Words Using Mutual Information Statistics

In recent years, large monolingual, comparable and parallel corpora have played a very crucial role in solving various problems of computational linguistics including machine translation, information retrieval, natural language processing, and the like. This paper tries to solve the problem of polysemy of Persian words while translating them into Persian by the computer. We use Mutual Informati...

متن کامل

German, French, English and Persian Retrieval Experiments at CLEF 2009

We describe evaluation experiments conducted by submitting retrieval runs for the monolingual German, French, English and Persian (Farsi) information retrieval tasks of the Ad Hoc Track of the Cross-Language Evaluation Forum (CLEF) 2009. In the ad hoc retrieval tasks, the system was given 50 natural language queries, and the goal was to find all of the relevant records or documents (with high p...

متن کامل

Automatic Term Extraction for Cross-Language Information Retrieval Using a Bilingual Parallel Corpus

Information retrieval is a crucial area of natural language processing (NLP) and can be defined as finding documents whose content is relevant to the query need of a user. Cross-language information retrieval refers to a kind of information retriev/al in which the language of the query and that of searched document are different. This paper tries to construct a bilingual lexicon from an English...

متن کامل

German, French, English and Persian Retrieval Experiments at CLEF 2008

We describe evaluation experiments conducted by submitting retrieval runs for the monolingual German, French, English and Persian (Farsi) information retrieval tasks of the Ad-Hoc Track of the Cross-Language Evaluation Forum (CLEF) 2008. In the ad hoc retrieval tasks, the system was given 50 natural language queries, and the goal was to find all of the relevant records or documents (with high p...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010